Evaluation of Two Parallel Finite Element Implementations of the Time-Dependent Advection Diffusion Problem: GPU versus Cluster Considering Time and Energy Consumption

Authors

Alberto Ferreira de Souza

Lucas de Paula Veronese

Leonardo M. Lima

Claudine Badue

Lucia Catabriga

Abstract

We analyze two parallel finite element implementations of the 2D time-dependent advection diffusion problem, one for multi-core clusters and one for CUDA-enabled GPUs, and compare their performance in terms of time and energy consumption. The parallel CUDA-enabled GPU implementation was derived from the multi-core cluster version. Our experimental results show that a desktop machine with a single CUDA-enabled GPU can achieve higher performance than a 24-machine (96-core) cluster in this class of finite element problems. Also, the CUDA-enabled GPU implementation consumes less than one twentieth of the energy (Joules) consumed by the multi-core cluster implementation while solving a whole instance of the finite element problem.

Related resources

Two-dimensional advection-dispersion equation with depth- dependent variable source concentration

The present work solves two-dimensional Advection-Dispersion Equation (ADE) in a semi-infinite domain. A variable source concentration is regarded as the monotonic decreasing function at the source boundary (x=0). Depth-dependent variables are considered to incorporate real life situations in this modeling study, with zero flux condition assumed to occur at the exit boundary of the domain, i.e....

Implementation of the direction of arrival estimation algorithms by means of GPU-parallel processing in the Kuda environment (Research Article)

Direction-of-arrival (DOA) estimation of audio signals is critical in different areas, including electronic war, sonar, etc. The beamforming methods like Minimum Variance Distortionless Response (MVDR), Delay-and-Sum (DAS), and subspace-based Multiple Signal Classification (MUSIC) are the most known DOA estimation techniques. The mentioned methods have high computational complexity. Hence using...

Efficient implementation of low time complexity and pipelined bit-parallel polynomial basis multiplier over binary finite fields

This paper presents two efficient implementations of fast and pipelined bit-parallel polynomial basis multipliers over GF(2^m) by irreducible pentanomials and trinomials. The architecture of the first multiplier is based on a parallel and independent computation of powers of the polynomial variable. In the second structure only even powers of the polynomial variable are used. The par...

A bi-objective model for the assembly flow shop scheduling problem with sequence dependent setup times and considering energy consumption

Publication date: 2012
